Backport subject "riscv: Memory Hot(Un)Plug support" #220
Open
uestc-gr wants to merge 12 commits into RVCK-Project:rvck-6.6 from
mainline inclusion
from mainline-Linux 6.8-rc1
commit ff172d4
category: feature
bugzilla: RVCK-Project#219

--------------------------------

This will allow better TLB utilization and then should be more performant.

Before:

---[ vmemmap start ]---
0xffff8d8002000000-0xffff8d8012000000 0x000000046ec00000 256M PTE . .. .. D A G . . W R V
---[ vmemmap end ]---

After:

---[ vmemmap start ]---
0xffff8d8002000000-0xffff8d8012000000 0x000000046ec00000 256M PMD . .. .. D A G . . W R V
---[ vmemmap end ]---

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20231214132935.212864-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
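To make the TLB argument concrete, the following sketch counts how many leaf entries are needed to map the 256M vmemmap range from the dump above. It assumes 4 KiB base pages and 2 MiB PMD mappings; the helper name `entries_needed` is made up for illustration and is not part of the patch.

```shell
# Count the leaf page-table entries needed to map a region at a given
# mapping size. Both arguments are in bytes; the region is assumed to be
# a multiple of the mapping size.
entries_needed() {
    region_bytes=$1
    map_bytes=$2
    echo $(( region_bytes / map_bytes ))
}

region=$(( 256 * 1024 * 1024 ))                  # 256M vmemmap range

entries_needed "$region" $(( 4 * 1024 ))         # 4 KiB PTE mappings -> 65536
entries_needed "$region" $(( 2 * 1024 * 1024 ))  # 2 MiB PMD mappings -> 128
```

Under these assumptions the same range needs 512 times fewer leaf entries when mapped with PMDs, which is where the better TLB utilization would come from.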
mainline inclusion
from mainline-Linux 6.10-rc6
commit e3ecf2f
category: feature
bugzilla: RVCK-Project#219

--------------------------------

Make sure that the altmap parameter is properly passed on to vmemmap_populate_hugepages().

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-2-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-Linux 6.10-rc6
commit 6667309
category: feature
bugzilla: RVCK-Project#219

--------------------------------

The RISC-V port copies the PGD table from init_mm/swapper_pg_dir to all userland page tables, which means that if the PGD level table is changed, other page tables has to be updated as well.

Instead of having the PGD changes ripple out to all tables, the synchronization can be avoided by pre-allocating the PGD entries/pages at boot, avoiding the synchronization all together. This is currently done for the bpf/modules, and vmalloc PGD regions. Extend this scheme for the PGD regions touched by memory hotplugging.

Prepare the RISC-V port for memory hotplug by pre-allocate vmemmap/direct map/kasan entries at the PGD level. This will roughly waste ~128 (plus 32 if KASAN is enabled) worth of 4K pages when memory hotplugging is enabled in the kernel configuration.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-3-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-Linux 6.10-rc6
commit fe122b8
category: feature
bugzilla: RVCK-Project#219

--------------------------------

Prepare for memory hotplugging support by changing from __init to __meminit for the page table functions that are used by the upcoming architecture specific callbacks.

Changing the __init attribute to __meminit, avoids that the functions are removed after init. The __meminit attribute makes sure the functions are kept in the kernel text post init, but only if memory hotplugging is enabled for the build.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-4-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-Linux 6.10-rc6
commit 007480f
category: feature
bugzilla: RVCK-Project#219

--------------------------------

Add a parameter to the direct map setup function, so it can be used in arch_add_memory() later.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-5-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-Linux 6.10-rc6
commit 6e6c5e2
category: feature
bugzilla: RVCK-Project#219

--------------------------------

The pfn_to_kaddr() function is used by KASAN's memory hotplugging path. Add the missing function to the RISC-V port, so that it can be built with MHP and CONFIG_KASAN.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-6-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-Linux 6.10-rc6
commit c75a74f
category: feature
bugzilla: RVCK-Project#219

--------------------------------

For an architecture to support memory hotplugging, a couple of callbacks needs to be implemented:

 arch_add_memory()
  This callback is responsible for adding the physical memory into the direct map, and call into the memory hotplugging generic code via __add_pages() that adds the corresponding struct page entries, and updates the vmemmap mapping.

 arch_remove_memory()
  This is the inverse of the callback above.

 vmemmap_free()
  This function tears down the vmemmap mappings (if CONFIG_SPARSEMEM_VMEMMAP is enabled), and also deallocates the backing vmemmap pages. Note that for persistent memory, an alternative allocator for the backing pages can be used; The vmem_altmap. This means that when the backing pages are cleared, extra care is needed so that the correct deallocation method is used.

 arch_get_mappable_range()
  This functions returns the PA range that the direct map can map. Used by the MHP internals for sanity checks.

The page table unmap/teardown functions are heavily based on code from the x86 tree. The same remove_pgd_mapping() function is used in both vmemmap_free() and arch_remove_memory(), but in the latter function the backing pages are not removed.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-7-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-Linux 6.10-rc6
commit 37992b7
category: feature
bugzilla: RVCK-Project#219

--------------------------------

During memory hot remove, the ptdump functionality can end up touching stale data. Avoid any potential crashes (or worse), by holding the memory hotplug read-lock while traversing the page table.

This change is analogous to arm64's commit bf2b59f ("arm64/mm: Hold memory hotplug lock while walking for kernel page table dump").

Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-8-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-Linux 6.10-rc6
commit f8c2a24
category: feature
bugzilla: RVCK-Project#219

--------------------------------

Enable ARCH_ENABLE_MEMORY_HOTPLUG and ARCH_ENABLE_MEMORY_HOTREMOVE for RISC-V.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-9-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-Linux 6.10-rc6
commit 0546d70
category: feature
bugzilla: RVCK-Project#219

--------------------------------

Now that RISC-V has memory hotplugging support, virtio-mem can be used on the platform.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-10-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-Linux 6.10-rc6
commit 216e04b
category: feature
bugzilla: RVCK-Project#219

--------------------------------

ZONE_DEVICE pages need DEVMAP PTEs support to function (ARCH_HAS_PTE_DEVMAP). Claim another RSW (reserved for software) bit in the PTE for DEVMAP mark, add the corresponding helpers, and enable ARCH_HAS_PTE_DEVMAP for riscv64.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-11-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
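As a rough illustration of what claiming an RSW bit means, the sketch below models a PTE as a plain integer and sets or tests one software-reserved bit. The RSW field occupying PTE bits 9:8 comes from the RISC-V privileged specification; the choice of bit 8 for the already-used "special" mark and bit 9 for DEVMAP is an assumption made here for illustration and is not stated in the commit message. The shell functions are stand-ins named after the kind of helper the commit describes, not the kernel's C implementation.

```shell
# Assumed bit positions (not taken from the commit message): RSW is PTE
# bits 9:8, with bit 8 already used for "special" and bit 9 for DEVMAP.
PAGE_SPECIAL=$(( 1 << 8 ))
PAGE_DEVMAP=$(( 1 << 9 ))

# Return the PTE value with the DEVMAP mark set.
pte_mkdevmap() {
    echo $(( $1 | PAGE_DEVMAP ))
}

# Succeed (exit status 0) if the PTE value carries the DEVMAP mark.
pte_devmap() {
    [ $(( $1 & PAGE_DEVMAP )) -ne 0 ]
}

pte=$(( 0xcf ))                     # example PTE: D A G . . W R V all set
marked=$(pte_mkdevmap "$pte")
if pte_devmap "$marked"; then
    echo "devmap bit set"
fi
```

Because the mark lives in a software-reserved bit, hardware page-table walks ignore it, and the existing permission and status bits are left unchanged.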
mainline inclusion
from mainline-Linux 6.10-rc6
commit 4705c15
category: feature
bugzilla: RVCK-Project#219

--------------------------------

Now that DAX is usable, enable the DAX VMEMMAP optimization as well.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-12-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Gao Rui <gao.rui@zte.com.cn>
Test started. Log: https://github.com/RVCK-Project/rvck/actions/runs/21853554165
Argument parsing result
Test complete. Detailed results: RVCK result

Kunit Test Result
[06:01:07] Testing complete. Ran 457 tests: passed: 445, skipped: 12

Kernel Build Result
Kernel build succeeded: RVCK-Project/rvck/220/
bec60930b515d16c1ed7d15878479b54 /srv/guix_result/8cf9dd2288d3d01b51c734b2d91539d68c629a35/Image

LAVA Check
args:
result: Lava check done!
lava log: https://lava.oerv.ac.cn/scheduler/job/1413
lava result count: [fail]: 174, [pass]: 1435, [skip]: 290

Check Patch Result
This PR is complete and has passed self-testing. All patches come from the upstream community. Reviewers, please take a look.
issues: #219
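This patch set lets the RISC-V architecture adjust physical memory capacity dynamically at runtime and adds support for managing device memory regions. Both are important for improving resource utilization and flexibility in data center and cloud computing environments.
Verify as follows:
1. Apply the following patch to QEMU:
https://lore.kernel.org/qemu-devel/20240521105635.795211-1-bjorn@kernel.org/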
2. Enable CONFIG_MEMORY_HOTPLUG, CONFIG_MEMORY_HOTREMOVE, and CONFIG_VIRTIO_MEM in the kernel configuration.
3. Start the virtual machine with a QEMU monitor and a virtio-mem device configured:
qemu-system-riscv64 \
    -nographic -machine virt \
    -smp 8 \
    -M virt -cpu rv64 \
    -m 16G,slots=3,maxmem=32G \
    -object memory-backend-ram,id=mem0,size=16G \
    -blockdev node-name=pflash0,driver=file,read-only=on,filename=RISCV_VIRT_CODE.fd \
    -blockdev node-name=pflash1,driver=file,filename=RISCV_VIRT_VARS.fd \
    -kernel ./rvck/arch/riscv/boot/Image \
    -monitor unix:/tmp/qemu-monitor.sock,server,nowait \
    -object memory-backend-ram,id=vmem0,size=2G \
    -device virtio-mem-pci,id=vm0,memdev=vmem0,node=0 \
    .......
4. In the QEMU monitor, run the following command to hot-add or hot-remove memory:
qom-set vm0 requested-size XX
5. In the RISC-V virtual machine, run the following command to bring the newly added memory online:
echo 1 > /sys/devices/system/memory/memoryX/online
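Steps 2 and 5 above can be scripted. The sketch below is only a convenience under stated assumptions: `check_config` and `online_all_blocks` are hypothetical helper names, the kernel config path is whatever file the caller passes in, and the memory-block directory defaults to the sysfs path used in step 5.

```shell
# Step 2 helper: report whether each listed option is set to y or m in the
# given kernel config file. Returns non-zero if any option is missing.
check_config() {
    config_file=$1
    shift
    missing=0
    for opt in "$@"; do
        if grep -Eq "^${opt}=(y|m)$" "$config_file"; then
            echo "ok: $opt"
        else
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Step 5 helper: write 1 to the "online" file of every memory block that
# currently reads 0, then print how many blocks were brought online.
# The directory argument is optional and exists so the function can be
# pointed at a test directory instead of the real sysfs tree.
online_all_blocks() {
    mem_root=${1:-/sys/devices/system/memory}
    count=0
    for block in "$mem_root"/memory[0-9]*; do
        [ -f "$block/online" ] || continue
        if [ "$(cat "$block/online")" = "0" ]; then
            echo 1 > "$block/online" && count=$(( count + 1 ))
        fi
    done
    echo "$count"
}
```

For example, `check_config .config CONFIG_MEMORY_HOTPLUG CONFIG_MEMORY_HOTREMOVE CONFIG_VIRTIO_MEM` could be run from the kernel build directory (assuming the config lives in `.config` there), and `online_all_blocks` could be run as root inside the guest after step 4.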